The present analysis aims to dissect the genotype by environment interaction study (aka GEI) using a data set from the Dry Beans breeding program - MSU.
The trait in the study is the yield per plot (lb/plot) adjusted to the international measurements (Kg/ha). A previous report was done to describe further details in the raw data adjustment analysis. Plant maturity also will be investigated in order to check the MTME model effects.
In this vignette, different types of analysis will be performed in order to study the GEI in the Multi-Environment-Trials (MET) data to provide better varieties recommendations to the breeding program.
The core functions in this vignette are available at (statgenGxE?) and (metan?) R packages. The availability of functions in these packages is based on the analyses described in (Malosetti2013?) and (Olivoto2019?). Further suggested reading is (vanEeuwijk2016?), (ColombariFilho?).
The following types of analysis will be done using the packages described previously: (this topic list will be edited, but for now we have a reference list)
rm(list=ls())
my.path <- dirname(rstudioapi::getActiveDocumentContext()$path)
setwd(my.path)
# R setup ###########
options("scipen"=999,"digits" =5) # set numbering format
#> Online License checked out Wed Jan 12 14:47:54 2022
This vignette some table will be built with (pkgdown?). All tables will be produced with the package DT using the following function.
library(DT) # Used to make the tables
# Function to make HTML tables
print_table <- function(table, rownames = FALSE, digits = 6, ...){
df <- datatable(table, rownames = rownames, extensions = 'Buttons',
options = list(scrollX = TRUE,
dom = '<<t>Bp>',
buttons = c('copy', 'excel', 'pdf', 'print')), ...)
num_cols <- c(as.numeric(which(sapply(table, class) == "numeric")))
if(length(num_cols) > 0){
formatSignif(df, columns = num_cols, digits = digits)
} else{
df
}
}
The data in CleanData_final_01 will be used after the clean and quality process. This data set was provided by Scott Bates and it consists of data with genotype, market classes, locations, years, repetitions, yield, and maturity. The data came from 4 different locations: Bay (BA), Tuscola (TU), Sanilac (SA), and Huron (HU). The following table shows the principal data structure from this project, in which BB is Black Beans, NA = Navy Beans, and SR = Small Red.
For the purpose of this study, the Loc effect will be considered as the trial effect (Env). In this case, 4 environments will be considered. This aim is important to define the environment recommendations according to the genotype performance.
General data distribution number for each factor from this analysis
| Mkt class | Genotype | Rep. | Loc | Year |
|---|---|---|---|---|
| BB | 29 | 4 | 4 | 5 |
| NB | 35 | 4 | 4 | 5 |
| SR | 11 | 4 | 4 | 5 |
| TOTAL | 75 | 3 | 4 | 5 |
The data in CleanData_final_01 will be used after the clean and quality process has been performed. Reading all Market classes together and individually
data_beans = read.csv("CleanData_final_01.csv",h=T, stringsAsFactors = T)
data_beans <- data_beans %>%
unite("rep_loc",c("rep","loc"), remove = F, na.rm = F)
data_beans %>%
as_tibble()
#> # A tibble: 6,000 x 9
#> name rep_loc rep year loc year_loc expt yield_kg_ha_adj_all DPM
#> <fct> <chr> <int> <int> <fct> <fct> <fct> <dbl> <int>
#> 1 13505 1_BA 1 17 BA 17_BA BB 2491. NA
#> 2 13505 1_HU 1 17 HU 17_HU BB 4248. NA
#> 3 13505 1_SA 1 17 SA 17_SA BB 1848. NA
#> 4 13505 1_TU 1 17 TU 17_TU BB 2461. NA
#> 5 13505 1_BA 1 18 BA 18_BA BB 2864. NA
#> 6 13505 1_HU 1 18 HU 18_HU BB 2965. NA
#> 7 13505 1_SA 1 18 SA 18_SA BB 4103. NA
#> 8 13505 1_TU 1 18 TU 18_TU BB 1753. NA
#> 9 13505 1_BA 1 19 BA 19_BA BB 2524. NA
#> 10 13505 1_HU 1 19 HU 19_HU BB 3116. NA
#> # ... with 5,990 more rows
### Data adjustment
#All the effect columns must be as a factor to run in ASReml-r.\n
#Removing NA's values for the `yield_kg_ha_adj_all` column (trait) in order to avoid any mistake.
cols <- c("rep", "rep_loc", "name", "loc","year", "expt", "year_loc")
data_beans[cols] <- lapply(data_beans[cols], factor)
data_beans <- data.table(data_beans)
data_beans <- na.omit(data_beans,cols = "yield_kg_ha_adj_all")
#str(data_beans)
rep and locWe will correct the genetic values by year within loc and rep in order to investigate the residuals values. The following pepiline will be used: 1-Stage = BLUEs estimation for the vector of the variable yield in the ith genotype, and jth year within loc and rep. Then the BLUEs from the 1-Stage (Yik) will be used to predict the BLUPs (Yijl) of the ith genotype in the lth location and jth rep in the 2-Stage in which this second model have name and loc effects as random.
The LSMEANS will be estimated using a mixed-effect model. The BLUEs will be obtained by location and rep using a loop with (asreml?) and storage into the list. The following mixed model was used to estimate the BLUEs of each genotype within location and rep with one value per genotype per experiment, for the first step (random effects are underlined in all equations):
\[ {\sf\underline{Y_{ik}} = \mu + {G_i} + \underline{S_{ik}} + \underline{\varepsilon_{ik}} } \]
where \(y_{ik}\) is the observed yield in the ith genotype and kth year, \(\mu\) is the overall mean, \(G_i\) is the effect of the ith genotype, \(S_{k}\) is the effect of the kth year, and \(\varepsilon_{ik}\) are the residual, with \(S_{k}\)~N(0,\(\sigma_{Y}^{2}\)), and \(\varepsilon_{ik}\)~N(0,\(\sigma_{\varepsilon}^{2}\)), all independent, where \(\sigma_{Y}^{2}\) is the year variance, and \(\sigma_{\varepsilon}^{2}\) is the mean error variance across experiments.
The BLUEs will be obtained by location and rep (rep_loc) using a loop with (asreml?) and storage it into the stgI_list list. The output here will be a table with name, yield_LSM, se columns.
## Analysis per site
Envs <- levels(data_beans$rep_loc)
stgI_list <- matrix(data=list(), nrow=length(Envs), ncol=1,
dimnames=list(Envs, c("lsmeans")))
############################
##### Stage I LSMEANS #####
for (i in Envs){
Edat <- droplevels(subset(data_beans, rep_loc==i))
print(i)
mod.1 <- asreml(fixed = yield_kg_ha_adj_all ~ name ,
random = ~ year ,
data = Edat,
#predict = predict.asreml(classify = "name"),
trace = F,
maxit = 500)
print(summary.asreml(mod.1)$varcomp)
wald(mod.1)
blue<- predict(mod.1, classify="name", levels=levels(Edat$name), vcov=TRUE, aliased = T)
blue.1 <- data.table(blue$pvals)[, c(1:3)]
names(blue.1) <- c("name", "yield_LSM", "se")
stgI_list[[i, "lsmeans" ]] <- blue.1 # put all the results of Stage 1 in the list
rm(Edat,mod.1, blue, blue.1)
}
#> [1] "1_BA"
#> Online License checked out Wed Jan 12 14:47:59 2022
#> component std.error z.ratio bound %ch
#> year 193028 143047 1.3494 P 0.6
#> units!R 439401 48091 9.1369 P 0.0
#> [1] "1_HU"
#> component std.error z.ratio bound %ch
#> year 31502 28870 1.0912 P 0
#> units!R 408607 43925 9.3023 P 0
#> [1] "1_SA"
#> component std.error z.ratio bound %ch
#> year 222709 164055 1.3575 P 0.7
#> units!R 493534 53069 9.2998 P 0.0
#> [1] "1_TU"
#> component std.error z.ratio bound %ch
#> year 110836 86554 1.2805 P 0
#> units!R 452231 48355 9.3523 P 0
#> [1] "2_BA"
#> component std.error z.ratio bound %ch
#> year 233970 174851 1.3381 P 0.3
#> units!R 595421 65161 9.1377 P 0.0
#> [1] "2_HU"
#> component std.error z.ratio bound %ch
#> year 79136 65363 1.2107 P 0.1
#> units!R 563263 61452 9.1659 P 0.0
#> [1] "2_SA"
#> component std.error z.ratio bound %ch
#> year 425143 308886 1.3764 P 0.4
#> units!R 605598 64743 9.3538 P 0.0
#> [1] "2_TU"
#> component std.error z.ratio bound %ch
#> year 49442 41905 1.1798 P 0.5
#> units!R 438513 46748 9.3804 P 0.0
#> [1] "3_BA"
#> component std.error z.ratio bound %ch
#> year 481671 348799 1.3809 P 0
#> units!R 486283 53216 9.1379 P 0
#> [1] "3_HU"
#> component std.error z.ratio bound %ch
#> year 120096 93305 1.2871 P 0.6
#> units!R 542137 58463 9.2731 P 0.0
#> [1] "3_SA"
#> component std.error z.ratio bound %ch
#> year 342219 247609 1.3821 P 0
#> units!R 345699 37170 9.3006 P 0
#> [1] "3_TU"
#> component std.error z.ratio bound %ch
#> year 228289 166442 1.3716 P 0.4
#> units!R 329847 35367 9.3265 P 0.0
#> [1] "4_BA"
#> component std.error z.ratio bound %ch
#> year 204544 157604 1.2978 P 0.3
#> units!R 536146 67562 7.9356 P 0.0
#> [1] "4_HU"
#> component std.error z.ratio bound %ch
#> year 50052 45510 1.0998 P 0.1
#> units!R 485176 61317 7.9126 P 0.0
#> [1] "4_SA"
#> component std.error z.ratio bound %ch
#> year 285370 211436 1.3497 P 1
#> units!R 541645 67454 8.0299 P 0
#> [1] "4_TU"
#> component std.error z.ratio bound %ch
#> year 117634 96827 1.2149 P 0.7
#> units!R 541394 68525 7.9006 P 0.0
Merging the original data to have all the factors in the final table with: name, loc, expt, rep, rep_loc
##### Unlist the results of Stage I and format as data.table #####
lsm_stageI <- data.table(ldply(stgI_list[, "lsmeans"], data.frame, .id="rep_loc"))
lsm_stageI <- lsm_stageI[order(lsm_stageI$rep_loc,lsm_stageI$name),]
lsm_stageI$units <- seq.int(nrow(lsm_stageI))
#str(lsm_stageI)
data_beans_col <- data_beans[,c("name", "loc", "expt","rep", "rep_loc")]
data_beans_col <- data_beans_col[order(data_beans_col$rep_loc,data_beans_col$name),]
lsm_stage.I <- merge(data_beans_col, lsm_stageI,
by=c("name","rep_loc"))
lsm_stage.I <- droplevels(lsm_stage.I[!duplicated(lsm_stage.I$units),])
str(lsm_stage.I)
#> Classes 'data.table' and 'data.frame': 1198 obs. of 8 variables:
#> $ name : Factor w/ 75 levels "12039","12062",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ rep_loc : Factor w/ 16 levels "1_BA","1_HU",..: 1 2 3 4 5 6 7 8 9 10 ...
#> $ loc : Factor w/ 4 levels "BA","HU","SA",..: 1 2 3 4 1 2 3 4 1 2 ...
#> $ expt : Factor w/ 3 levels "BB","NB","SR": 2 2 2 2 2 2 2 2 2 2 ...
#> $ rep : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
#> $ yield_LSM: num 3089 3481 3425 3713 3108 ...
#> $ se : num 356 297 378 336 407 ...
#> $ units : int 1 76 151 226 301 376 451 526 601 676 ...
#> - attr(*, ".internal.selfref")=<externalptr>
#> - attr(*, "sorted")= chr [1:2] "name" "rep_loc"
#write.csv(lsm_stage.I,"lsm_stage.I.csv",row.names=T,quote=F)
The following linear mixed model with interaction effect will be used in the 2-Stage in order to investigate the multi-environment trials (MET) as follow by (metan?) R package:
\[ {\sf\underline{Y_{ijl}} = \mu + \underline{G_i} + E_l + \beta_{jl} + \underline{GE_{il}} + \underline{\varepsilon_{ijl}} } \]
where \({y_{ijl}}\) is the response variable (e.g., grain yield) observed in the jth repetion of the ith genotype in the lth location (i = 1, 2, …, g; j = 1, 2, …, b; l = 1, 2, …, e); \(\mu\) is the grand mean; \(\underline{G_i}\) is the effect of the ith genotype; \(E_l\) is the effect of the lth location (env); \(\beta_{jl}\) is the effect of the jth rep with the lth location; \(\underline{GE_{il}}\) is the interaction effect of the ith genotype nested within the lth location; and \(\mathop \varepsilon \nolimits_{ijl}\) is the random error, in witch with \(G_{i}\)~N(0,\(\sigma_{G}^{2}\)), \(GE_{il}\)~N(0,\(\sigma_{GE}^{2}\)), and \(\varepsilon_{ijl}\)~N(0,\(\sigma_{\varepsilon}^{2}\)), all independent, where \(G_{G}^{2}\) is the genotype (name) variance,\(GS_{GE}^{2}\) is the interaction genotype x environment variance, and \(\sigma_{\varepsilon}^{2}\) is the mean error variance across experiments.
mod.met.4mod.met.4 <- asreml(fixed = yield_LSM ~ loc + loc:rep,
random = ~ name + name:loc,
data = lsm_stage.I,
predict = predict.asreml(classify = "name"),
trace = F,
maxit = 500)
print(wald(mod.met.4))
#> [0;34mWald tests for fixed effects.[0m
#> [0;34mResponse: yield_LSM[0m
#> [0;34mTerms added sequentially; adjusted for those above.[0m
#>
#> Df Sum of Sq Wald statistic Pr(Chisq)
#> (Intercept) 1 1166993362 9443 < 0.0000000000000002 ***
#> loc 3 49221386 398 < 0.0000000000000002 ***
#> loc:rep 12 7937378 64 0.0000000038 ***
#> residual (MS) 123584
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(summary.asreml(mod.met.4)$varcomp)
#> component std.error z.ratio bound %ch
#> name 61933 12865.7 4.8138 P 0
#> name:loc 32118 6163.7 5.2108 P 0
#> units!R 123584 5872.2 21.0456 P 0
print(summary.asreml(mod.met.4)$bic)
#> [1] 15460
#> attr(,"parameters")
#> [1] 3
blup.met.4<- data.table((mod.met.4$predictions$pvals[1:3]))
names(blup.met.4) <- c("name", "yield_LSM_MET", "se")
mod.met.metan.a Metan MET analysis mixed_mod.a <-
gamem_met(lsm_stage.I,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:01
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.0000000000000000000388
#> GEN:ENV 0.0000000000007931366015
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
blup.metan.a<- mixed_mod.a$yield_LSM$BLUPgen
blup.metan.int.a<- mixed_mod.a$yield_LSM$BLUPint
blup.metan.means.a<- means_by(blup.metan.int.a,GEN,ENV)
locWe will correct the genetic values by year within loc in order to investigate the residuals values. The following pepiline will be used: 1-Stage = BLUEs estimation for the vector of the variable yield in the ith genotype and jth block, and jth year within loc. Then the BLUEs from the 1-Stage (Yik) will be used to predict the BLUPs (Yijl) of the ith genotype in the lth location and jth rep in the 2-Stage in which this second model have name and loc effects as random.
The LSMEANS will be estimated using a mixed-effect model. The BLUEs will be obtained by location and rep using a loop with (asreml?) and storage into the list.
The following mixed model was used to estimate the BLUEs of each genotype within location and rep with one value per genotype per experiment, for the first step (random effects are underlined in all equations):
\[ {\sf\underline{Y_{ijk}} = \mu + {G_i} + \underline{S}_k + \beta_{jk} + \underline{GS_{ik}} + \underline{\varepsilon_{ijl}} } \]
where \({y_{ijk}}\) is the response variable (e.g., grain yield) observed in the jth block of the ith genotype in the kth year (i = 1, 2, …, g; j = 1, 2, …, b; k = 1, 2, …, y); \(\mu\) is the grand mean; \({G_i}\) is the effect of the ith genotype; \(\underline{S}_l\) is the effect of the kth year; \(\beta_{jk}\) is the effect of the jth replication (i.e., blocks) with the kth year; \(\underline{GS_{ik}}\) is the interaction effect of the kth year nested within the ith genotype; and \(\mathop \varepsilon \nolimits_{ijk}\) is the random error, in witch with \(S_{k}\)~N(0,\(\sigma_{Y}^{2}\)), \(GS_{ik}\)~N(0,\(\sigma_{GS}^{2}\)), and \(\varepsilon_{ijk}\)~N(0,\(\sigma_{\varepsilon}^{2}\)), all independent, where \(\sigma_{Y}^{2}\) is the genotype (name) variance,\(\sigma_{GS}^{2}\)is the interaction genotype x year variance, and \(\sigma_{\varepsilon}^{2}\) is the mean error variance across experiments.
The BLUEs will be obtained by location and rep (rep_loc) using a loop with (asreml?) and storage it into the stgI_list list. The output here will be a table with name, yield_LSM, se columns.
## Analysis per site
Envs <- levels(data_beans$loc)
stgI_list <- matrix(data=list(), nrow=length(Envs), ncol=1,
dimnames=list(Envs, c("lsmeans")))
############################
##### Stage I LSMEANS #####
for (i in Envs){
Edat <- droplevels(subset(data_beans, loc==i))
print(i)
mod.1 <- asreml(fixed = yield_kg_ha_adj_all ~ name:rep + name,
random = ~ year + name:year ,
data = Edat,
predict = predict.asreml(classify = "name:rep",vcov=TRUE, aliased = T),
trace = F,
maxit = 500)
print(summary.asreml(mod.1)$varcomp)
wald(mod.1)
blue.1<- data.table((mod.1$predictions$pvals[1:4]))
names(blue.1) <- c("name", "rep", "yield_LSM", "se")
stgI_list[[i, "lsmeans" ]] <- blue.1 # put all the results of Stage 1 in the list
rm(Edat,mod.1, blue.1)
}
#> [1] "BA"
#> component std.error z.ratio bound %ch
#> year 269851 195915 1.3774 P 0
#> name:year 209901 32895 6.3808 P 0
#> units!R 324786 21139 15.3641 P 0
#> [1] "HU"
#> component std.error z.ratio bound %ch
#> year 43221 34511 1.2524 P 0
#> name:year 167866 28970 5.7945 P 0
#> units!R 353615 22859 15.4695 P 0
#> [1] "SA"
#> component std.error z.ratio bound %ch
#> year 310766 224596 1.3837 P 0
#> name:year 215994 31751 6.8028 P 0
#> units!R 287814 18426 15.6204 P 0
#> [1] "TU"
#> component std.error z.ratio bound %ch
#> year 113408 83869 1.3522 P 0
#> name:year 111051 22627 4.9079 P 0
#> units!R 338078 21743 15.5490 P 0
Merging the original data to have all the factors in the final table with: name, loc, expt, rep, loc
##### Unlist the results of Stage I and format as data.table #####
lsm_stageI <- data.table(ldply(stgI_list[, "lsmeans"], data.frame, .id="loc"))
lsm_stageI <- lsm_stageI[order(lsm_stageI$loc,lsm_stageI$name),]
lsm_stageI$units <- seq.int(nrow(lsm_stageI))
#str(lsm_stageI)
data_beans_col <- data_beans[,c("name", "loc", "expt","rep")]
data_beans_col <- data_beans_col[order(data_beans_col$loc,data_beans_col$name),]
lsm_stage.I <- merge(data_beans_col, lsm_stageI,
by=c("name", "rep", "loc"))
lsm_stage.I <- droplevels(lsm_stage.I[!duplicated(lsm_stage.I$units),])
str(lsm_stage.I)
#> Classes 'data.table' and 'data.frame': 1198 obs. of 7 variables:
#> $ name : Factor w/ 75 levels "12039","12062",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ rep : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
#> $ loc : Factor w/ 4 levels "BA","HU","SA",..: 1 2 3 4 1 2 3 4 1 2 ...
#> $ expt : Factor w/ 3 levels "BB","NB","SR": 2 2 2 2 2 2 2 2 2 2 ...
#> $ yield_LSM: num 3089 3481 3425 3713 3108 ...
#> $ se : num 401 336 404 335 401 ...
#> $ units : int 1 301 601 901 2 302 602 902 3 303 ...
#> - attr(*, ".internal.selfref")=<externalptr>
#> - attr(*, "sorted")= chr [1:3] "name" "rep" "loc"
#write.csv(lsm_stage.I,"lsm_stage.I.csv",row.names=T,quote=F)
The following linear mixed model with interaction effect will be used in the 2-Stage in order to investigate the multi-environment trials (MET) as follow by (metan?) R package:
\[ {\sf\underline{Y_{ijl}} = \mu + \underline{G_i} + E_l + \beta_{jl} + \underline{GE_{il}} + \underline{\varepsilon_{ijl}} } \]
where \({y_{ijl}}\) is the response variable (e.g., grain yield) observed in the jth block of the ith genotype in the lth location (i = 1, 2, …, g; j = 1, 2, …, b; l = 1, 2, …, e); \(\mu\) is the grand mean; \(\underline{G_i}\) is the effect of the ith genotype; \(E_l\) is the effect of the lth location (env); \(\beta_{jl}\) is the effect of the jth rep with the lth location; \(\underline{GE_{il}}\) is the interaction effect of the ith genotype nested within the lth location; and \(\mathop \varepsilon \nolimits_{ijl}\) is the random error, in witch with \(G_{i}\)~N(0,\(\sigma_{G}^{2}\)), \(GE_{il}\)~N(0,\(\sigma_{GE}^{2}\)), and \(\varepsilon_{ijl}\)~N(0,\(\sigma_{\varepsilon}^{2}\)), all independent, where \(G_{G}^{2}\) is the genotype (name) variance,\(GS_{GE}^{2}\) is the interaction genotype x environment variance, and \(\sigma_{\varepsilon}^{2}\) is the mean error variance across experiments.
mod.met.5mod.met.5 <- asreml(fixed = yield_LSM ~ loc + loc:rep,
random = ~ name + name:loc,
data = lsm_stage.I,
predict = predict.asreml(classify = "name"),
trace = F,
maxit = 500)
print(wald(mod.met.5))
#> [0;34mWald tests for fixed effects.[0m
#> [0;34mResponse: yield_LSM[0m
#> [0;34mTerms added sequentially; adjusted for those above.[0m
#>
#> Df Sum of Sq Wald statistic Pr(Chisq)
#> (Intercept) 1 1173868043 9496 < 0.0000000000000002 ***
#> loc 3 50795466 411 < 0.0000000000000002 ***
#> loc:rep 12 5972234 48 0.0000028 ***
#> residual (MS) 123616
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(summary.asreml(mod.met.5)$varcomp)
#> component std.error z.ratio bound %ch
#> name 61312 12774.4 4.7996 P 0
#> name:loc 32326 6183.4 5.2279 P 0
#> units!R 123616 5873.6 21.0459 P 0
print(summary.asreml(mod.met.5)$bic)
#> [1] 15460
#> attr(,"parameters")
#> [1] 3
blup.met.5<- data.table((mod.met.5$predictions$pvals[1:3]))
names(blup.met.5) <- c("name", "yield_LSM_MET", "se")
mod.met.metan.b Metan MET analysis mixed_mod.b <-
gamem_met(lsm_stage.I,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:01
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.0000000000000000000711
#> GEN:ENV 0.0000000000006136671139
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
blup.metan.b<- mixed_mod.b$yield_LSM$BLUPgen
blup.metan.int.b<- mixed_mod.b$yield_LSM$BLUPint
blup.metan.means.b<- means_by(blup.metan.int.b,GEN,ENV)
data_merge_blups <- merge(blup.met.4,blup.met.5, by = "name" )
corr.blup <- data_merge_blups %$%
cor.test(yield_LSM_MET.x, yield_LSM_MET.y) %>%
tidy %>%
mutate_if(is.numeric, round, 4)
text <- paste0('r P:BLUPs = ', corr.blup$estimate)
#annotate('text', x = 400, y = mean(pheno.add$Blue)+400, label=text))
#cor.test(blup.gen$yield_LSM_MET,blup.met.4$yield_LSM_MET)
plot(blup.met.4$yield_LSM_MET,blup.met.5$yield_LSM_MET, ylab = "Blup Yield - mod.met.4", xlab = "Blup Yield - mod.met.5")
abline(0,1, col = "gray", lty = 1)
abline(lm(blup.met.4$yield_LSM_MET~ blup.met.5$yield_LSM_MET), col = "blue", lty = 2)
temp <- legend("bottomright", legend = c(" ", " "),
text.width = strwidth("1,000,000"),
lty = 1:2, xjust = 1, yjust = 1,
title = "Line Types")
text(temp$rect$left + temp$rect$w, temp$text$y,
c("1:1 line", "Fit line"), pos = 2)
text(x=2850, y=3500, labels=text)
mod.met.metan.b Metan MET analysis mixed_mod<-
gamem_met(lsm_stage.I,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:01
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.0000000000000000000711
#> GEN:ENV 0.0000000000006136671139
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
blup.metan.b<- mixed_mod.b$yield_LSM$BLUPgen
blup.metan.int.b<- mixed_mod.b$yield_LSM$BLUPint
blup.metan.means.b<- means_by(blup.metan.int.b,GEN,ENV)
The S3 generic function plot() is used to generate diagnostic plots of residuals of the model.
plot(mixed_mod)
The normality of the random effects of genotype and interaction effects may be also obtained by using type = "re".
plot(mixed_mod, type = "re")
The output LRT contains the Likelihood Ratio Tests for genotype and genotype-vs-environment random effects. We can get these values with get_model_data()
data <- get_model_data(mixed_mod, "lrt")
print_table(data)
In the output ESTIMATES, beyond the variance components for the declared random effects, some important parameters are also shown. Heribatility is the broad-sense heritability, \(\mathop h\nolimits_g^2\), estimated by \[
\mathop h\nolimits_g^2 = \frac{\mathop {\hat\sigma} \nolimits_g^2} {\mathop {\hat\sigma} \nolimits_g^2 + \mathop {\hat\sigma} \nolimits_i^2 + \mathop {\hat\sigma} \nolimits_e^2 }
\]
where \(\mathop {\hat\sigma} \nolimits_g^2\) is the genotypic variance; \(\mathop {\hat\sigma} \nolimits_i^2\) is the genotype-by-environment interaction variance; and \(\mathop {\hat\sigma} \nolimits_e^2\) is the residual variance.
GEIr2 is the coefficient of determination of the interaction effects, \(\mathop r\nolimits_i^2\), estimated by
\[ \mathop r\nolimits_i^2 = \frac{\mathop {\hat\sigma} \nolimits_i^2} {\mathop {\hat\sigma} \nolimits_g^2 + \mathop {\hat\sigma} \nolimits_i^2 + \mathop {\hat\sigma} \nolimits_e^2 } \] Heribatility of means is the heribability on the mean basis, \(\mathop h\nolimits_{gm}^2\), estimated by
\[ \mathop h\nolimits_{gm}^2 = \frac{\mathop {\hat\sigma} \nolimits_g^2}{[\mathop {\hat\sigma} \nolimits_g^2 + \mathop {\hat\sigma} \nolimits_i^2 /e + \mathop {\hat\sigma} \nolimits_e^2 /\left( {eb} \right)]} \]
where e and b are the number of environments and blocks, respectively; Accuracy is the accuracy of selection, Ac, estimated by \[ Ac = \sqrt{\mathop h\nolimits_{gm}^2} \]
rge is the genotype-environment correlation, \(\mathop r\nolimits_{ge}\), estimated by
\[ \mathop r\nolimits_{ge} = \frac{\mathop {\hat\sigma} \nolimits_g^2}{\mathop {\hat\sigma} \nolimits_g^2 + \mathop {\hat\sigma} \nolimits_i^2} \]
CVg and CVr are the the genotypic coefficient of variation and the residual coefficient of variation estimated, respectively, by \[ CVg = \left( {\sqrt {\mathop {\hat \sigma }\nolimits_g^2 } /\mu } \right) \times 100 \] and \[ CVr = \left( {\sqrt {\mathop {\hat \sigma }\nolimits_e^2 } /\mu } \right) \times 100 \] where \(\mu\) is the grand mean.
CV ratio is the ratio between genotypic and residual coefficient of variation.
data <- get_model_data(mixed_mod)
print_table(data)
print_table(mixed_mod$yield_LSM $BLUPgen)
The function get_model_data() may be used to easily get the data from a model fitted with the function gamem_met(), especially when more than one variables are used. The following code return the predicted mean of each genotype for five variables of the data data_ge2.
get_model_data(mixed_mod, what = "blupg")
#> # A tibble: 75 x 2
#> GEN yield_LSM
#> <fct> <dbl>
#> 1 12039 3264.
#> 2 12062 3372.
#> 3 12063 3308.
#> 4 12064 2833.
#> 5 13505 3148.
#> 6 13SR1-1 2949.
#> 7 14068 3173.
#> 8 14078 3089.
#> 9 14080 3001.
#> 10 14084 3185.
#> # ... with 65 more rows
library(ggplot2)
a <- plot_blup(mixed_mod)
b <- plot_blup(mixed_mod,
col.shape = c("gray20", "gray80"),
plot_theme = theme_metan(grid = "y")) +
coord_flip()
arrange_ggplot(a, b, tag_levels = "a")
This output shows the predicted means for genotypes. BLUPg is the genotypic effect \((\hat{g}_{i})\), which considering balanced data and genotype as random effect is estimated by
\[ \hat{g}_{i} = h_g^2(\bar{y}_{i.}-\bar{y}_{..}) \]
where \(h_g^2\) is the shrinkage effect for genotype. Predicted is the predicted mean estimated by \[ \hat{g}_{i}+\mu \]
where \(\mu\) is the grand mean. LL and UL are the lower and upper limits, respectively, estimated by \[ (\hat{g}_{i}+\mu)\pm{CI} \] with \[ CI = t\times\sqrt{((1-Ac)\times{\mathop \sigma \nolimits_g^2)}} \]
where \(t\) is the Student’s t value for a two-tailed t test at a given probability error; \(Ac\) is the accuracy of selection and \(\mathop \sigma \nolimits_g^2\) is the genotypic variance.
print_table(mixed_mod$yield_LSM$BLUPint)
This output shows the predicted means for each genotype and environment combination. BLUPg is the genotypic effect described above. BLUPge is the genotypic effect of the ith genotype in the jth environment \((\hat{g}_{ij})\), which considering balanced data and genotype as random effect is estimated by \[\hat{g}_{ij} = h_g^2(\bar{y}_{i.}-\bar{y}_{..})+h_{ge}^2(y_{ij}-\bar{y}_{i.}-\bar{y}_{.j}+\bar{y}_{..})\] where \(h_{ge}^2\) is the shrinkage effect for the genotype-by-environment interaction; BLUPg+ge is \(BLUP_g+BLUP_{ge}\); Predicted is the predicted mean (\(\hat{y}_{ij}\)) estimated by \[ \hat{y}_{ij} = \bar{y}_{.j}+BLUP_{g+ge} \]
The following pieces of information are provided in Details output. Nenv, the number of environments in the analysis; Ngen the number of genotypes in the analysis; mresp The value attributed to the highest value of the response variable after rescaling it; wresp The weight of the response variable for estimating the WAASBY index. Mean the grand mean; SE the standard error of the mean; SD the standard deviation. CV the coefficient of variation of the phenotypic means, estimating WAASB, Min the minimum value observed (returning the genotype and environment), Max the maximum value observed (returning the genotype and environment); MinENV the environment with the lower mean, MaxENV the environment with the larger mean observed, MinGEN the genotype with the lower mean, MaxGEN the genotype with the larger.
data <- get_model_data(mixed_mod, "details")
print_table(data)
The function waasb() function computes the Weighted Average of the Absolute Scores considering all possible IPCA from the Singular Value Decomposition of the BLUPs for genotype-vs-environment interaction effects obtained by an Linear Mixed-effect Model (Olivoto2019?), as follows:
\[ WAASB_i = \sum_{k = 1}^{p} |IPCA_{ik} \times EP_k|/ \sum_{k = 1}^{p}EP_k \]
where \(WAASB_i\) is the weighted average of absolute scores of the ith genotype; \(IPCA_{ik}\) is the scores of the ith genotype in the kth IPCA; and \(EP_k\) is the explained variance of the kth PCA for \(k = 1,2,..,p\), \(p = min(g-1; e-1)\).
waasb_model <-
waasb(lsm_stage.I,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:01
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.0000000000000000000711
#> GEN:ENV 0.0000000000006136671139
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
data <- waasb_model$yield_LSM$model
print_table(data)
The output generated by the waasb() function is very similar to those generated by the waas() function. The main difference here, is that the singular value decomposition is based on the BLUP for GEI effects matrix.
data <- waasb_model$yield_LSM$PCA
print_table(data)
plot_eigen(waasb_model, size.lab = 14, size.tex.lab = 14)
The above output shows the eigenvalues and the proportion of variance explained by each principal component axis of the BLUP interaction effects matrix.
data <- waasb_model$yield_LSM$MeansGxE
print_table(data)
In this output, Y is the phenotypic mean for each genotype and environment combination (\(y_{ij}\)), estimated by \(y_{ij} = \sum_k{y_{ij}}/B\) with \(k = 1,2,...B\).
Provided that an object of class waasb is available in the global environment, the graphics may be obtained using the function plot_scores(). To do that, we will revisit the previusly fitted model WAASB . Please, refer to ?plot_scores for more details. Four types of graphics can be generated: 1 = \(PC1 \times PC2\); 2 = \(GY \times PC1\); 3 = \(GY \times WAASB\); and 4 = a graphic with nominal yield as a function of the environment PCA1 scores.
c <- plot_scores(waasb_model, type = 1)
d <- plot_scores(waasb_model,
type = 1,
col.gen = "black",
col.env = "red",
col.segm.env = "red",
axis.expand = 1.5)
arrange_ggplot(c, d, tag_levels = list(c("c", "d")))
e <- plot_scores(waasb_model, type = 2)
f <- plot_scores(waasb_model,
type = 2,
polygon = TRUE,
col.segm.env = "transparent",
plot_theme = theme_metan_minimal())
arrange_ggplot(e, f, tag_levels = list(c("e", "f")))
The quadrants proposed by (Olivoto2019?) in the following biplot represent four classifications regarding the joint interpretation of mean performance and stability. The genotypes or environments included in quadrant I can be considered unstable genotypes or environments with high discrimination ability, and with productivity below the grand mean. In quadrant II are included unstable genotypes, although with productivity above the grand mean. The environments included in this quadrant deserve special attention since, in addition to providing high magnitudes of the response variable, they present a good discrimination ability. Genotypes within quadrant III have low productivity, but can be considered stable due to the lower values of WAASB. The lower this value, the more stable the genotype can be considered. The environments included in this quadrant can be considered as poorly productive and with low discrimination ability. The genotypes within the quadrant IV are highly productive and broadly adapted due to the high magnitude of the response variable and high stability performance (lower values of WAASB).
g <- plot_scores(waasb_model, type = 3)
h <- plot_scores(waasb_model, type = 3,
x.lab = "My customized x label",
size.shape.gen = 3,
size.tex.gen = 2,
#x.lim = c(1.2, 4.7),
#x.breaks = seq(1.5, 4.5, by = 0.5),
plot_theme = theme_metan(color.background = "white"))
arrange_ggplot(g, h, tag_levels = list(c("g", "h")))
i <- plot_scores(waasb_model, type = 4)
j <- plot_scores(waasb_model,
type = 4,
size.tex.gen = 1.5,
color = FALSE,
col.alpha.gen = 0,
col.alpha.env = 0,
plot_theme = theme_metan(color.background = "white"))
arrange_ggplot(i, j, tag_levels = list(c("i", "j")))
The waasby index is used for genotype ranking considering both the stability (waasb) and mean performance (y) based on the following model (Olivoto2019?).
\[ waasby_i = \frac{{\left( {r {Y_i} \times {\theta _Y}} \right) + \left( {r {W_i} \times {\theta _W}} \right)}}{{{\theta _Y} + {\theta _W}}} \]
where \(waasby_i\) is the superiority index for the i-th genotype; \(rY_i\) and \(rW_i\) are the rescaled values (0-100) for the response variable (y) and the stability (WAAS or WAASB), respectively; \(\theta _Y\) and \(\theta_W\) are the weights for mean performance and stability, respectively.
This index was also already computed and stored into AMMI_model>GY>model. An intuitively plot may be obtained by running
i <- plot_waasby(waasb_model)
j <- plot_waasby(waasb_model, col.shape = c("gray20", "gray80"))
arrange_ggplot(i, j, tag_levels = list(c("e", "f")))
In the following example, we will apply the function wsmp() to the previously fitted model waasb_model aiming at planning different scenarios of waasby estimation by changing the weights assigned to the stability and the mean performance. The number of scenarios is defined by the arguments increment. By default, twenty-one different scenarios are computed. In this case, the the superiority index waasby is computed considering the following weights: stability (waasb or waas) = 100; mean performance = 0. In other words, only stability is considered for genotype ranking. In the next iteration, the weights becomes 95/5 (since increment = 5). In the third scenario, the weights become 90/10, and so on up to these weights become 0/100. In the last iteration, the genotype ranking for WAASY or WAASBY matches perfectly with the ranks of the response variable.
scenarios <- wsmp(waasb_model)
#> Ranks considering 0 for GY and 100 for WAASB | | 2% 00:00:00
Ranks considering 0 for GY and 100 for WAASB |= | 3% 00:00:00
Ranks considering 0 for GY and 100 for WAASB |= | 5% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |== | 6% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |== | 8% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |=== | 10% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |=== | 11% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |==== | 13% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |==== | 14% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |===== | 16% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |===== | 17% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |====== | 19% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |====== | 21% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |====== | 22% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |======= | 24% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======= | 25% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======== | 27% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======== | 29% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========= | 30% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========= | 32% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========== | 33% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |========== | 35% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |=========== | 37% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |=========== | 38% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 40% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 41% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 43% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============= | 44% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============= | 46% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============== | 48% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |============== | 49% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |=============== | 51% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |=============== | 52% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================ | 54% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================ | 56% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================= | 57% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================= | 59% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================= | 60% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================== | 62% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |================== | 63% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |=================== | 65% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |=================== | 67% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |==================== | 68% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |==================== | 70% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |===================== | 71% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |===================== | 73% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |====================== | 75% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |====================== | 76% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 78% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 79% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 81% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |======================== | 83% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |======================== | 84% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |========================= | 86% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================= | 87% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================== | 89% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================== | 90% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================ | 92% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================ | 94% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================= | 95% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |============================ | 97% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |=============================| 98% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |=============================| 100% 00:00:00
print_table(scenarios$yield_LSM$hetcomb)
In addition, the genotype ranking depending on the number of multiplicative terms used to estimate the WAAS index is also computed.
print_table(scenarios$yield_LSM$hetdata)
The first type of heatmap shows the genotype ranking depending on the number of principal component axes used for estimating the WAASB index. An euclidean distance-based dendrogram is used for grouping the genotypes based on their ranks. The second type of heatmap shows the genotype ranking depending on the WAASB/GY ratio. The ranks obtained with a ratio of 100/0 considers exclusively the stability for genotype ranking. On the other hand, a ratio of 0/100 considers exclusively the productivity for genotype ranking. Four clusters are estimated (1) unproductive and unstable genotypes; (2) productive, but unstable genotypes; (3) stable, but unproductive genotypes; and (4), productive and stable genotypes (Olivoto2019?).
plot(scenarios, type = 1)
plot(scenarios, type = 2)
(ColombariFilho2013?) have shown the use of three BLUP-based indexes for selecting genotypes with performance and stability. The first is the harmonic mean of genotypic values -or BLUPS- (HMGV) a stability index that considers the genotype with the highest harmonic mean across environments as the most stable, as follows:
\[ HMG{V_i} = \frac{E}{{\sum\limits_{j = 1}^E {\frac{1}{{BLUP{_{ij}}}}}}} \]
The second is the relative performance of genotypic values (RPGV), an adaptability index estimated as follows:
\[ RPGV_i = \frac{1}{e}{\sum\limits_{j = 1}^e {BLUP_{ij}} /\mathop \mu \nolimits_j } \]
The third and last is the harmonic mean of relative performance of genotypic values (HMRPGV), a simultaneous selection index for stability, adaptability and mean performance, estimated as follows:
\[ HMRPG{V_i} = \frac{E}{{\sum\limits_{j = 1}^E {\frac{1}{{BLUP{_{ij}}/{\mu _j}}}} }} \]
Res_ind <-
lsm_stage.I %>%
gamem_met(loc, name, rep, yield_LSM, verbose = FALSE) %>%
blup_indexes()
print_table(Res_ind$yield_LSM)
as_tibble(Res_ind$yield_LSM)
#> # A tibble: 75 x 10
#> GEN Y HMGV HMGV_R RPGV RPGV_Y RPGV_R HMRPGV HMRPGV_Y HMRPGV_R
#> ##### Unlist the results of Stage I and format as data.table #####
lsm_stageI <- data.table(ldply(stgI_list[, "lsmeans"], data.frame, .id="loc"))
lsm_stageI <- lsm_stageI[order(lsm_stageI$loc,lsm_stageI$name),]
lsm_stageI$units <- seq.int(nrow(lsm_stageI))
#str(lsm_stageI)
data_beans_col <- data_beans[,c("name", "loc", "expt","rep")]
data_beans_col <- data_beans_col[order(data_beans_col$loc,data_beans_col$name),]
lsm_stage.I <- merge(data_beans_col, lsm_stageI,
by=c("name", "rep", "loc"))
lsm_stage.I <- droplevels(lsm_stage.I[!duplicated(lsm_stage.I$units),])
str(lsm_stage.I)
#> Classes 'data.table' and 'data.frame': 1198 obs. of 7 variables:
#> $ name : Factor w/ 75 levels "12039","12062",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ rep : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
#> $ loc : Factor w/ 4 levels "BA","HU","SA",..: 1 2 3 4 1 2 3 4 1 2 ...
#> $ expt : Factor w/ 3 levels "BB","NB","SR": 2 2 2 2 2 2 2 2 2 2 ...
#> $ yield_LSM: num 3089 3481 3425 3713 3108 ...
#> $ se : num 401 336 404 335 401 ...
#> $ units : int 1 301 601 901 2 302 602 902 3 303 ...
#> - attr(*, ".internal.selfref")=<externalptr>
#> - attr(*, "sorted")= chr [1:3] "name" "rep" "loc"
#write.csv(lsm_stage.I,"lsm_stage.I.csv",row.names=T,quote=F)
Getting the files for the individually market classes analysis
lsm_stage.I_BB <- droplevels(lsm_stage.I[which(lsm_stage.I$expt=="BB")])
lsm_stage.I_NB <- droplevels(lsm_stage.I[which(lsm_stage.I$expt=="NB")])
lsm_stage.I_SR <- droplevels(lsm_stage.I[which(lsm_stage.I$expt=="SR")])
mod.met.metan.bb Metan MET analysis mixed_mod_bb<-
gamem_met(lsm_stage.I_BB,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:00
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.0000000000615
#> GEN:ENV 0.1748493171938
#> ---------------------------------------------------------------------------
#> Variables with nonsignificant GxE interaction
#> yield_LSM
#> ---------------------------------------------------------------------------
blup.metan.bb<- mixed_mod.b$yield_LSM$BLUPgen
blup.metan.int.bb<- mixed_mod.b$yield_LSM$BLUPint
blup.metan.means.bb<- means_by(blup.metan.int.bb,GEN,ENV)
The S3 generic function plot() is used to generate diagnostic plots of residuals of the model.
plot(mixed_mod_bb)
The normality of the random effects of genotype and interaction effects may be also obtained by using type = "re".
plot(mixed_mod_bb, type = "re")
The output LRT contains the Likelihood Ratio Tests for genotype and genotype-vs-environment random effects. We can get these values with get_model_data()
data <- get_model_data(mixed_mod_bb, "lrt")
print_table(data)
data <- get_model_data(mixed_mod_bb)
print_table(data)
print_table(mixed_mod_bb$yield_LSM $BLUPgen)
The following code return the predicted mean of each genotype for five variables of the data data_ge2.
get_model_data(mixed_mod_bb, what = "blupg")
#> # A tibble: 29 x 2
#> GEN yield_LSM
#> <fct> <dbl>
#> 1 13505 3172.
#> 2 13SR1-1 2962.
#> 3 15610 3272.
#> 4 15619 3146.
#> 5 16504 3573.
#> 6 16590 3412.
#> 7 16648 3362.
#> 8 17220 3238.
#> 9 17715 3390.
#> 10 17751 3606.
#> # ... with 19 more rows
library(ggplot2)
a <- plot_blup(mixed_mod_bb)
b <- plot_blup(mixed_mod_bb,
col.shape = c("gray20", "gray80"),
plot_theme = theme_metan(grid = "y")) +
coord_flip()
arrange_ggplot(a, b, tag_levels = "a")
print_table(mixed_mod_bb$yield_LSM$BLUPint)
data <- get_model_data(mixed_mod_bb, "details")
print_table(data)
waasb_model_bb <-
waasb(lsm_stage.I_BB,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:00
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.0000000000615
#> GEN:ENV 0.1748493171938
#> ---------------------------------------------------------------------------
#> Variables with nonsignificant GxE interaction
#> yield_LSM
#> ---------------------------------------------------------------------------
data <- waasb_model_bb$yield_LSM$model
print_table(data)
data <- waasb_model_bb$yield_LSM$PCA
print_table(data)
plot_eigen(waasb_model_bb, size.lab = 14, size.tex.lab = 14)
The above output shows the eigenvalues and the proportion of variance explained by each principal component axis of the BLUP interaction effects matrix.
data <- waasb_model_bb$yield_LSM$MeansGxE
print_table(data)
c <- plot_scores(waasb_model_bb, type = 1)
d <- plot_scores(waasb_model_bb,
type = 1,
col.gen = "black",
col.env = "red",
col.segm.env = "red",
axis.expand = 1.5)
arrange_ggplot(c, d, tag_levels = list(c("c", "d")))
e <- plot_scores(waasb_model_bb, type = 2)
f <- plot_scores(waasb_model_bb,
type = 2,
polygon = TRUE,
col.segm.env = "transparent",
plot_theme = theme_metan_minimal())
arrange_ggplot(e, f, tag_levels = list(c("e", "f")))
g <- plot_scores(waasb_model_bb, type = 3)
h <- plot_scores(waasb_model_bb, type = 3,
x.lab = "My customized x label",
size.shape.gen = 3,
size.tex.gen = 2,
#x.lim = c(1.2, 4.7),
#x.breaks = seq(1.5, 4.5, by = 0.5),
plot_theme = theme_metan(color.background = "white"))
arrange_ggplot(g, h, tag_levels = list(c("g", "h")))
i <- plot_scores(waasb_model_bb, type = 4)
j <- plot_scores(waasb_model_bb,
type = 4,
size.tex.gen = 1.5,
color = FALSE,
col.alpha.gen = 0,
col.alpha.env = 0,
plot_theme = theme_metan(color.background = "white"))
arrange_ggplot(i, j, tag_levels = list(c("i", "j")))
This index was also already computed and stored into AMMI_model>GY>model. An intuitively plot may be obtained by running
i <- plot_waasby(waasb_model_bb)
j <- plot_waasby(waasb_model_bb, col.shape = c("gray20", "gray80"))
arrange_ggplot(i, j, tag_levels = list(c("e", "f")))
scenarios <- wsmp(waasb_model_bb)
#> Ranks considering 0 for GY and 100 for WAASB | | 2% 00:00:00
Ranks considering 0 for GY and 100 for WAASB |= | 3% 00:00:00
Ranks considering 0 for GY and 100 for WAASB |= | 5% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |== | 6% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |== | 8% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |=== | 10% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |=== | 11% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |==== | 13% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |==== | 14% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |===== | 16% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |===== | 17% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |====== | 19% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |====== | 21% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |====== | 22% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |======= | 24% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======= | 25% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======== | 27% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======== | 29% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========= | 30% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========= | 32% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========== | 33% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |========== | 35% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |=========== | 37% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |=========== | 38% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 40% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 41% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 43% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============= | 44% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============= | 46% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============== | 48% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |============== | 49% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |=============== | 51% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |=============== | 52% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================ | 54% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================ | 56% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================= | 57% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================= | 59% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================= | 60% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================== | 62% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |================== | 63% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |=================== | 65% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |=================== | 67% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |==================== | 68% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |==================== | 70% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |===================== | 71% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |===================== | 73% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |====================== | 75% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |====================== | 76% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 78% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 79% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 81% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |======================== | 83% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |======================== | 84% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |========================= | 86% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================= | 87% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================== | 89% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================== | 90% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================ | 92% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================ | 94% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================= | 95% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |============================ | 97% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |=============================| 98% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |=============================| 100% 00:00:00
print_table(scenarios$yield_LSM$hetcomb)
In addition, the genotype ranking depending on the number of multiplicative terms used to estimate the WAAS index is also computed.
print_table(scenarios$yield_LSM$hetdata)
plot(scenarios, type = 1)
plot(scenarios, type = 2)
Res_ind <-
lsm_stage.I_BB %>%
gamem_met(loc, name, rep, yield_LSM, verbose = FALSE) %>%
blup_indexes()
print_table(Res_ind$yield_LSM)
as_tibble(Res_ind$yield_LSM)
#> # A tibble: 29 x 10
#> GEN Y HMGV HMGV_R RPGV RPGV_Y RPGV_R HMRPGV HMRPGV_Y HMRPGV_R
#> mod.met.metan.bb Metan MET analysis mixed_mod_sr<-
gamem_met(lsm_stage.I_SR,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:00
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.010695791
#> GEN:ENV 0.000000318
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
blup.metan.sr<- mixed_mod.b$yield_LSM$BLUPgen
blup.metan.int.bb<- mixed_mod.b$yield_LSM$BLUPint
blup.metan.means.bb<- means_by(blup.metan.int.bb,GEN,ENV)
The S3 generic function plot() is used to generate diagnostic plots of residuals of the model.
plot(mixed_mod_sr)
The normality of the random effects of genotype and interaction effects may be also obtained by using type = "re".
plot(mixed_mod_sr, type = "re")
The output LRT contains the Likelihood Ratio Tests for genotype and genotype-vs-environment random effects. We can get these values with get_model_data()
data <- get_model_data(mixed_mod_sr, "lrt")
print_table(data)
data <- get_model_data(mixed_mod_sr)
print_table(data)
print_table(mixed_mod_sr$yield_LSM $BLUPgen)
The following code return the predicted mean of each genotype for five variables of the data data_ge2.
get_model_data(mixed_mod_sr, what = "blupg")
#> # A tibble: 11 x 2
#> GEN yield_LSM
#> <fct> <dbl>
#> 1 16686 3057.
#> 2 17604 3051.
#> 3 17835 2742.
#> 4 17837 3053.
#> 5 17839 2915.
#> 6 18904 2863.
#> 7 CALDERA 3288.
#> 8 CAYENNE 3220.
#> 9 ROSETTA 3060.
#> 10 RUBY 2999.
#> 11 VIPER 3362.
library(ggplot2)
a <- plot_blup(mixed_mod_sr)
b <- plot_blup(mixed_mod_sr,
col.shape = c("gray20", "gray80"),
plot_theme = theme_metan(grid = "y")) +
coord_flip()
arrange_ggplot(a, b, tag_levels = "a")
print_table(mixed_mod_sr$yield_LSM$BLUPint)
data <- get_model_data(mixed_mod_sr, "details")
print_table(data)
waasb_model_sr <-
waasb(lsm_stage.I_SR,
env = loc,
gen = name,
rep = rep,
resp = yield_LSM,
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait yield_LSM |===============================================| 100% 00:00:00
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model yield_LSM
#> COMPLETE NA
#> GEN 0.010695791
#> GEN:ENV 0.000000318
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
data <- waasb_model_sr$yield_LSM$model
print_table(data)
data <- waasb_model_sr$yield_LSM$PCA
print_table(data)
plot_eigen(waasb_model_sr, size.lab = 14, size.tex.lab = 14)
The above output shows the eigenvalues and the proportion of variance explained by each principal component axis of the BLUP interaction effects matrix.
data <- waasb_model_sr$yield_LSM$MeansGxE
print_table(data)
c <- plot_scores(waasb_model_sr, type = 1)
d <- plot_scores(waasb_model_sr,
type = 1,
col.gen = "black",
col.env = "red",
col.segm.env = "red",
axis.expand = 1.5)
arrange_ggplot(c, d, tag_levels = list(c("c", "d")))
e <- plot_scores(waasb_model_sr, type = 2)
f <- plot_scores(waasb_model_sr,
type = 2,
polygon = TRUE,
col.segm.env = "transparent",
plot_theme = theme_metan_minimal())
arrange_ggplot(e, f, tag_levels = list(c("e", "f")))
g <- plot_scores(waasb_model_sr, type = 3)
h <- plot_scores(waasb_model_sr, type = 3,
x.lab = "My customized x label",
size.shape.gen = 3,
size.tex.gen = 2,
#x.lim = c(1.2, 4.7),
#x.breaks = seq(1.5, 4.5, by = 0.5),
plot_theme = theme_metan(color.background = "white"))
arrange_ggplot(g, h, tag_levels = list(c("g", "h")))
i <- plot_scores(waasb_model_sr, type = 4)
j <- plot_scores(waasb_model_sr,
type = 4,
size.tex.gen = 1.5,
color = FALSE,
col.alpha.gen = 0,
col.alpha.env = 0,
plot_theme = theme_metan(color.background = "white"))
arrange_ggplot(i, j, tag_levels = list(c("i", "j")))
This index was also already computed and stored into AMMI_model>GY>model. An intuitively plot may be obtained by running
i <- plot_waasby(waasb_model_sr)
j <- plot_waasby(waasb_model_sr, col.shape = c("gray20", "gray80"))
arrange_ggplot(i, j, tag_levels = list(c("e", "f")))
scenarios <- wsmp(waasb_model_sr)
#> Ranks considering 0 for GY and 100 for WAASB | | 2% 00:00:00
Ranks considering 0 for GY and 100 for WAASB |= | 3% 00:00:00
Ranks considering 0 for GY and 100 for WAASB |= | 5% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |== | 6% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |== | 8% 00:00:00
Ranks considering 5 for GY and 95 for WAASB |=== | 10% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |=== | 11% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |==== | 13% 00:00:00
Ranks considering 10 for GY and 90 for WAASB |==== | 14% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |===== | 16% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |===== | 17% 00:00:00
Ranks considering 15 for GY and 85 for WAASB |====== | 19% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |====== | 21% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |====== | 22% 00:00:00
Ranks considering 20 for GY and 80 for WAASB |======= | 24% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======= | 25% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======== | 27% 00:00:00
Ranks considering 25 for GY and 75 for WAASB |======== | 29% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========= | 30% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========= | 32% 00:00:00
Ranks considering 30 for GY and 70 for WAASB |========== | 33% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |========== | 35% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |=========== | 37% 00:00:00
Ranks considering 35 for GY and 65 for WAASB |=========== | 38% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 40% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 41% 00:00:00
Ranks considering 40 for GY and 60 for WAASB |============ | 43% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============= | 44% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============= | 46% 00:00:00
Ranks considering 45 for GY and 55 for WAASB |============== | 48% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |============== | 49% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |=============== | 51% 00:00:00
Ranks considering 50 for GY and 50 for WAASB |=============== | 52% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================ | 54% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================ | 56% 00:00:00
Ranks considering 55 for GY and 45 for WAASB |================= | 57% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================= | 59% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================= | 60% 00:00:00
Ranks considering 60 for GY and 40 for WAASB |================== | 62% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |================== | 63% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |=================== | 65% 00:00:00
Ranks considering 65 for GY and 35 for WAASB |=================== | 67% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |==================== | 68% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |==================== | 70% 00:00:00
Ranks considering 70 for GY and 30 for WAASB |===================== | 71% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |===================== | 73% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |====================== | 75% 00:00:00
Ranks considering 75 for GY and 25 for WAASB |====================== | 76% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 78% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 79% 00:00:00
Ranks considering 80 for GY and 20 for WAASB |======================= | 81% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |======================== | 83% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |======================== | 84% 00:00:00
Ranks considering 85 for GY and 15 for WAASB |========================= | 86% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================= | 87% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================== | 89% 00:00:00
Ranks considering 90 for GY and 10 for WAASB |========================== | 90% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================ | 92% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================ | 94% 00:00:00
Ranks considering 95 for GY and 5 for WAASB |============================= | 95% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |============================ | 97% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |=============================| 98% 00:00:00
Ranks considering 100 for GY and 0 for WAASB |=============================| 100% 00:00:00
print_table(scenarios$yield_LSM$hetcomb)
In addition, the genotype ranking depending on the number of multiplicative terms used to estimate the WAAS index is also computed.
print_table(scenarios$yield_LSM$hetdata)
plot(scenarios, type = 1)
plot(scenarios, type = 2)
Res_ind <-
lsm_stage.I_SR %>%
gamem_met(loc, name, rep, yield_LSM, verbose = FALSE) %>%
blup_indexes()
print_table(Res_ind$yield_LSM)
as_tibble(Res_ind$yield_LSM)
#> # A tibble: 11 x 10
#> GEN Y HMGV HMGV_R RPGV RPGV_Y RPGV_R HMRPGV HMRPGV_Y HMRPGV_R
#>